Polymorphic Data

Polymorphism in TimeBase is the ability of a single stream to contain multiple message types. See Basic Concepts and Messages to learn more.

In the following example, we query the tickquerydemo stream for all records. As a result, we get messages of different types: TradeMessage and BestBidOfferMessage.

#Shell format

==> SELECT * FROM tickquerydemo

>deltix.timebase.api.messages.TradeMessage,TIMESTAMP,SYMBOL,TYPE,price,size
0,2011-10-17 17:21:41,XBANK,EQUITY,301.25,800.0
>deltix.timebase.api.messages.BestBidOfferMessage,TIMESTAMP,SYMBOL,TYPE,offerPrice,offerSize,bidPrice,bidSize
1,2011-10-17 17:21:42,XBANK,EQUITY,301.5,60000.0,298.5,800.0
2,2011-10-17 17:21:43,GREATCO,EQUITY,45.0,100.0,43.0,400.0
3,2011-10-17 17:21:43,XBANK,EQUITY,299.5,40000.0,295.0,300.0

Polymorphic Union Model and GROUP BY ENTITY

Selecting * simply means selecting the current message as it is. Selecting any other combination of fields implies the construction of a new message type.

For example, we can select the combination of price, bidPrice, and offerPrice. Even though these fields are picked from two different message types, this is a legal request. Its meaning is "give me the last price and quote values as of the time of each event". The result of such a query is not very useful in our scenario, since it mixes the data of different symbols:

Select from different classes
#Shell format

SELECT price, bidPrice, offerPrice FROM tickquerydemo

>_,TIMESTAMP,SYMBOL,TYPE,price,bidPrice,offerPrice
0,2011-10-17 17:21:40,GREATCO,EQUITY,_,42.5,43.5
1,2011-10-17 17:21:40,XBANK,EQUITY,_,301.25,301.75
2,2011-10-17 17:21:41,XBANK,EQUITY,301.25,_,_
3,2011-10-17 17:21:42,XBANK,EQUITY,_,298.5,301.5
4,2011-10-17 17:21:43,GREATCO,EQUITY,_,43.0,45.0
5,2011-10-17 17:21:43,XBANK,EQUITY,_,295.0,299.5
6,2011-10-17 17:21:44,GREATCO,EQUITY,44.0,_,_
!end

While executing this query, the QQL engine creates an object called the Query State, which in our example contains three fields: price, bidPrice, and offerPrice. Initially, these fields are set to NULL. As each message arrives at the query engine, the corresponding fields are updated. The first underlying message in the stream is a BestBidOfferMessage for GREATCO. It causes the first update to the Query State, which we see in the output:

#Shell output format
>_,TIMESTAMP,SYMBOL,TYPE,price,bidPrice,offerPrice
0,2011-10-17 17:21:40,GREATCO,EQUITY,_,42.5,43.5

As you can see, price is still NULL, because we have not seen a TradeMessage yet. The next message in the source data set is another BestBidOfferMessage, this time for XBANK. This new message again updates the bidPrice and offerPrice fields of the Query State, while price is still NULL. The third message (a TradeMessage) finally updates the price field, and so on. In QQL terminology, the Query State object represents a Polymorphic Union Model (PUM) of the source data.
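The update rule described above can be mimicked with a small Python sketch. This is a toy model, not TimeBase code: the MESSAGES list, the FIELDS tuple, and the run_shared_state function are hypothetical names introduced for illustration. The values are taken from the sample listings on this page, and the exact rows emitted by a real QQL engine may differ.

```python
# Toy model of a single shared Query State (illustrative only, not TimeBase code).
# Each entry is (symbol, fields carried by that message); values come from the
# sample listings on this page.
MESSAGES = [
    ("GREATCO", {"bidPrice": 42.5, "offerPrice": 43.5}),     # BestBidOfferMessage
    ("XBANK",   {"bidPrice": 301.25, "offerPrice": 301.75}), # BestBidOfferMessage
    ("XBANK",   {"price": 301.25}),                          # TradeMessage
    ("XBANK",   {"bidPrice": 298.5, "offerPrice": 301.5}),   # BestBidOfferMessage
    ("GREATCO", {"bidPrice": 43.0, "offerPrice": 45.0}),     # BestBidOfferMessage
    ("XBANK",   {"bidPrice": 295.0, "offerPrice": 299.5}),   # BestBidOfferMessage
    ("GREATCO", {"price": 44.0}),                            # TradeMessage
]

FIELDS = ("price", "bidPrice", "offerPrice")


def run_shared_state(messages):
    """Apply every message to one Query State shared by all symbols."""
    state = dict.fromkeys(FIELDS)  # all selected fields start as None (NULL)
    rows = []
    for symbol, fields in messages:
        state.update(fields)  # only the fields present in this message change
        rows.append((symbol, *(state[name] for name in FIELDS)))
    return rows


shared_rows = run_shared_state(MESSAGES)
for index, row in enumerate(shared_rows):
    # Print NULL as "_" to match the shell listings.
    print(index, *("_" if value is None else value for value in row), sep=",")
```

In this model the fifth row (index 4) belongs to GREATCO but still carries 301.25, the last XBANK trade price, because nothing resets the shared state between symbols.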

The following diagram illustrates the flow of data in the query we just ran:

As we mentioned, such a query is useless on market data, as it mixes price from one symbol with quote values from another, as seen in messages #4 and #6:

#Shell format
4,2011-10-17 17:21:43,GREATCO,EQUITY,301.25,43.0,45.0
6,2011-10-17 17:21:44,GREATCO,EQUITY,44.0,295.0,299.5

What we want instead is to track the Query State independently for each symbol. Fortunately, there is a very simple way to make the QQL engine behave the way we want, by specifying the GROUP BY ENTITY clause. While in relational SQL the GROUP BY clause is only valid when aggregate values are being selected (such as MAX(), COUNT(), etc.), in QQL this clause has the more general effect of determining how many Query States are created, and how they are indexed.

#Shell format
SELECT price, bidPrice, offerPrice FROM tickquerydemo GROUP BY ENTITY

>_,TIMESTAMP,SYMBOL,TYPE,price,bidPrice,offerPrice
0,2011-10-17 17:21:40,GREATCO,EQUITY,_,42.5,43.5
1,2011-10-17 17:21:40,XBANK,EQUITY,_,301.25,301.75
2,2011-10-17 17:21:41,XBANK,EQUITY,301.25,_,_
3,2011-10-17 17:21:42,XBANK,EQUITY,_,298.5,301.5
4,2011-10-17 17:21:43,GREATCO,EQUITY,_,43.0,45.0
5,2011-10-17 17:21:43,XBANK,EQUITY,_,295.0,299.5
6,2011-10-17 17:21:44,GREATCO,EQUITY,44.0,_,_
!end

Without the GROUP BY ENTITY clause, there is only one Query State object for all symbols. When the GROUP BY ENTITY clause is present, a separate Query State object is created for each distinct symbol, and each new source message updates only the fields of the Query State for its own symbol.

Adding GROUP BY ENTITY to our polymorphic query instantly makes the result look the way we expect.
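The effect of keeping one Query State per symbol can be sketched in Python as follows. This is a toy model of the rule described in the text, not the engine's implementation: MESSAGES, FIELDS, and run_grouped_by_entity are hypothetical names, the values are copied from the sample listings on this page, and the rows a real QQL engine prints may differ.

```python
# Toy model of GROUP BY ENTITY: one Query State per symbol (illustrative only).
MESSAGES = [
    ("GREATCO", {"bidPrice": 42.5, "offerPrice": 43.5}),     # BestBidOfferMessage
    ("XBANK",   {"bidPrice": 301.25, "offerPrice": 301.75}), # BestBidOfferMessage
    ("XBANK",   {"price": 301.25}),                          # TradeMessage
    ("XBANK",   {"bidPrice": 298.5, "offerPrice": 301.5}),   # BestBidOfferMessage
    ("GREATCO", {"bidPrice": 43.0, "offerPrice": 45.0}),     # BestBidOfferMessage
    ("XBANK",   {"bidPrice": 295.0, "offerPrice": 299.5}),   # BestBidOfferMessage
    ("GREATCO", {"price": 44.0}),                            # TradeMessage
]

FIELDS = ("price", "bidPrice", "offerPrice")


def run_grouped_by_entity(messages):
    """Apply each message to the Query State that belongs to its symbol."""
    states = {}  # symbol -> Query State for that symbol
    rows = []
    for symbol, fields in messages:
        # Create the state on first sight of a symbol, with every field None (NULL).
        state = states.setdefault(symbol, dict.fromkeys(FIELDS))
        state.update(fields)  # other symbols' states are left untouched
        rows.append((symbol, *(state[name] for name in FIELDS)))
    return rows


grouped_rows = run_grouped_by_entity(MESSAGES)
for index, row in enumerate(grouped_rows):
    # Print NULL as "_" to match the shell listings.
    print(index, *("_" if value is None else value for value in row), sep=",")
```

In this model, GREATCO rows never pick up an XBANK trade price: the GREATCO state keeps price as NULL until the GREATCO trade at 44.0 arrives, while the XBANK state tracks its own price and quotes independently.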